Recognition sequences for I-CreI-derived meganucleases and uses thereof

ABSTRACT

Methods of cleaving double-stranded DNA that can be recognized and cleaved by a rationally-designed, I-CreI-derived meganuclease are provided. Also provided are recombinant nucleic acids, cells, and organisms containing such recombinant nucleic acids, as well as cells and organisms produced using such meganucleases. Also provided are methods of conducting a custom-designed, I-CreI-derived meganuclease business.

REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. patent application Ser. No. 17/065,340, filed Oct. 7, 2020, which is a Continuation of U.S. patent application Ser. No. 16/299,068, filed Mar. 11, 2019, which is a Continuation of U.S. patent application Ser. No. 15/472,175 (now patented as U.S. Pat. No. 10,273,524), filed Mar. 28, 2017, which is a Continuation of U.S. patent application Ser. No. 14/315,676 (now patented as U.S. Pat. No. 9,683,257), filed Jun. 26, 2014, which is a Continuation of U.S. patent application Ser. No. 13/006,625, now abandoned, filed on Jan. 14, 2011, which is a Continuation of International Application PCT/US2009/50566 filed on Jul. 14, 2009, which claims the benefit of U.S. Provisional Application No. 61/080,453, filed Jul. 14, 2008, the entire disclosures of which are incorporated herein by reference.

SEQUENCE LISTING

The contents of the electronic Sequence Listing (P109070006US07-SUBSEQ-NTJ.xml; Size: 33,219 bytes; and Date of Creation: Aug. 20, 2022) are herein incorporated by reference in their entirety.

FIELD OF THE INVENTION

The invention relates to the field of molecular biology and recombinant nucleic acid technology. In particular, the invention relates to DNA sequences that can be recognized and cleaved by a non-naturally-occurring, rationally-designed, I-CreI-derived homing endonuclease and methods of using same. The invention also relates to methods of producing recombinant nucleic acids, cells, and organisms using such meganucleases which cleave such DNA sites. The invention further relates to methods of conducting a custom-designed, I-CreI-derived meganuclease business.

BACKGROUND OF THE INVENTION

Genome engineering requires the ability to insert, delete, substitute and otherwise manipulate specific genetic sequences within a genome, and has numerous therapeutic and biotechnological applications. The development of effective means for genome modification remains a major goal in gene therapy, agrotechnology, and synthetic biology (Porteus et al. (2005), Nat. Biotechnol. 23: 967-73; Tzfira et al. (2005), Trends Biotechnol. 23: 567-9; McDaniel et al. (2005), Curr. Opin. Biotechnol. 16: 476-83). A common method for inserting or modifying a DNA sequence involves introducing a transgenic DNA sequence flanked by sequences homologous to the genomic target and selecting or screening for a successful homologous recombination event. Recombination with the transgenic DNA occurs rarely, but can be stimulated by a double-stranded break in the genomic DNA at the target site. Numerous methods have been employed to create DNA double-stranded breaks, including irradiation and chemical treatments. Although these methods efficiently stimulate recombination, the double-stranded breaks are randomly dispersed in the genome, which can be highly mutagenic and toxic. At present, the inability to target gene modifications to unique sites within a chromosomal background is a major impediment to successful genome engineering.

One approach to achieving this goal is stimulating homologous recombination at a double-stranded break in a target locus using a nuclease with specificity for a sequence that is sufficiently large to be present at only a single site within the genome (see, e.g., Porteus et al. (2005), Nat. Biotechnol. 23: 967-73). The effectiveness of this strategy has been demonstrated in a variety of organisms using chimeric fusions between an engineered zinc finger DNA-binding domain and the non-specific nuclease domain of the FokI restriction enzyme (Porteus (2006), Mol Ther 13: 438-46; Wright et al. (2005), Plant J. 44: 693-705; Urnov et al. (2005), Nature 435: 646-51). Although these artificial zinc finger nucleases stimulate site-specific recombination, they retain residual non-specific cleavage activity resulting from under-regulation of the nuclease domain and frequently cleave at unintended sites (Smith et al. (2000), Nucleic Acids Res. 28: 3361-9). Such unintended cleavage can cause mutations and toxicity in the treated organism (Porteus et al. (2005), Nat. Biotechnol. 23: 967-73).

A group of naturally-occurring nucleases which recognize 15-40 base-pair cleavage sites commonly found in the genomes of plants and fungi may provide a less toxic genome engineering alternative. Such “meganucleases” or “homing endonucleases” are frequently associated with parasitic DNA elements, such as group 1 self-splicing introns and inteins. They naturally promote homologous recombination or gene insertion at specific locations in the host genome by producing a double-stranded break in the chromosome, which recruits the cellular DNA-repair machinery (Stoddard (2006), Q. Rev. Biophys. 38: 49-95). Meganucleases are commonly grouped into four families: the LAGLIDADG (SEQ ID NO: 24) family, the GIY-YIG family, the His-Cys box family and the HNH family. These families are characterized by structural motifs, which affect catalytic activity and recognition sequence. For instance, members of the LAGLIDADG (SEQ ID NO: 24) family are characterized by having either one or two copies of the conserved LAGLIDADG (SEQ ID NO: 24) motif (see Chevalier et al. (2001), Nucleic Acids Res. 29(18): 3757-3774). The LAGLIDADG (SEQ ID NO: 24) meganucleases with a single copy of the LAGLIDADG (SEQ ID NO: 24) motif form homodimers, whereas members with two copies of the LAGLIDADG (SEQ ID NO: 24) motif are found as monomers.

Natural meganucleases, primarily from the LAGLIDADG (SEQ ID NO: 24) family, have been used to effectively promote site-specific genome modification in plants, yeast, Drosophila, mammalian cells and mice, but this approach has been limited to the modification of either homologous genes that conserve the meganuclease recognition sequence (Monnat et al. (1999), Biochem. Biophys. Res. Commun. 255: 88-93) or to pre-engineered genomes into which a recognition sequence has been introduced (Rouet et al. (1994), Mol. Cell. Biol. 14: 8096-106; Chilton et al. (2003), Plant Physiol. 133: 956-65; Puchta et al. (1996), Proc. Natl. Acad. Sci. USA 93: 5055-60; Rong et al. (2002), Genes Dev. 16: 1568-81; Gouble et al. (2006), J. Gene Med. 8(5):616-622).

Systematic implementation of nuclease-stimulated gene modification requires the use of engineered enzymes with customized specificities to target DNA breaks to existing sites in a genome and, therefore, there has been great interest in adapting meganucleases to promote gene modifications at medically or biotechnologically relevant sites (Porteus et al. (2005), Nat. Biotechnol. 23: 967-73; Sussman et al. (2004), J. Mol. Biol. 342: 31-41; Epinat et al. (2003), Nucleic Acids Res. 31: 2952-62).

I-CreI (SEQ ID NO: 1) is a member of the LAGLIDADG (SEQ ID NO: 24) family which recognizes and cleaves a 22 base pair recognition sequence in the chloroplast chromosome, and which presents an attractive target for meganuclease redesign. Genetic selection techniques have been used to modify the wild-type I-CreI recognition site preference (Sussman et al. (2004), J. Mol. Biol. 342: 31-41; Chames et al. (2005), Nucleic Acids Res. 33: e178; Seligman et al. (2002), Nucleic Acids Res. 30: 3870-9, Arnould et al. (2006), J. Mol. Biol. 355: 443-58). More recently, a method of rationally-designing mono-LAGLIDADG (SEQ ID NO: 24) meganucleases was described which is capable of comprehensively redesigning I-CreI and other such meganucleases to target widely-divergent DNA sites, including sites in mammalian, yeast, plant, bacterial, and viral genomes (WO 2007/047859).

The DNA sequences recognized by I-CreI are 22 base pairs in length. One example of a naturally-occurring I-CreI recognition site is provided in SEQ ID NO: 2 and SEQ ID NO: 3, but the enzyme will bind to a variety of related sequences with varying affinity. The enzyme binds DNA as a homodimer in which each monomer makes direct contacts with a nine base pair “half-site” and the two half-sites are separated by four base pairs that are not directly contacted by the enzyme (FIG. 1A). Like all LAGLIDADG (SEQ ID NO: 24) family meganucleases, I-CreI produces a staggered double-strand break at the center of its recognition sequences which results in the production of a four base pair 3′-overhang (FIG. 1A). The present invention concerns the central four base pairs in the I-CreI recognition sequences (i.e. the four base pairs that become the 3′ overhang following I-CreI cleavage, or “center sequence”, FIG. 1B). In the case of the native I-CreI recognition sequence in the Chlamydomonas reinhardtii 23S rRNA gene, this four base pair sequence is 5′-GTGA-3′. In the interest of producing genetically-engineered meganucleases which recognize DNA sequences that deviate from the wild-type I-CreI recognition sequences, it is desirable to know the extent to which the four base pair center sequence can deviate from the wild-type sequences. A number of published studies concerning I-CreI or its derivatives evaluated the enzyme, either wild-type or genetically-engineered, using DNA substrates that employed either the native 5′-GTGA-3′ central sequence or the palindromic sequence 5′-GTAC-3′. Recently, Arnould et al. (Arnould et al. (2007), J. Mol. Biol. 371: 49-65) reported that a set of genetically-engineered meganucleases derived from I-CreI cleaved DNA substrates with varying efficiencies depending on whether the substrate sequences were centered around 5′-GTAC-3′, 5′-TTGA-3′, 5′-GAAA-3′, or 5′-ACAC-3′ (cleavage efficiency: GTAC>ACAC>>TTGA≈GAAA).

SUMMARY OF THE INVENTION

The present invention is based, in part, upon the identification and characterization of a subset of DNA recognition sequences that can act as efficient substrates for cleavage by the rationally-designed, I-CreI-derived meganucleases (hereinafter, “I-CreI-derived meganucleases”).

In one aspect, the invention provides methods of identifying sets of 22 base pair DNA sequences which can be cleaved by I-CreI-derived meganucleases and which have, at their center, one of a limited set of four base pair DNA center sequences that contribute to more efficient cleavage by the I-CreI-derived meganucleases. The invention also provides methods that use such DNA sequences to produce recombinant nucleic acids, cells and organisms by utilizing the recognition sequences as substrates for I-CreI-derived meganucleases, and products incorporating such DNA sequences.
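The identification step described above can be illustrated with a short sketch. The following Python fragment is a minimal example under stated assumptions, not the inventors' procedure: the function name `find_candidate_sites`, the example input, and the small center-sequence subset are hypothetical and chosen only for demonstration. It slides a 22 base pair window along one strand and reports each window whose central four bases belong to the supplied set. It does not evaluate whether the flanking half-sites could be targeted by any particular meganuclease.

```python
def find_candidate_sites(dna, centers, half_site_len=9, center_len=4):
    """Return (start, window, center) for every window of
    half_site_len + center_len + half_site_len bases whose middle
    center_len bases appear in `centers`. Only the given strand is read."""
    dna = dna.upper()
    site_len = 2 * half_site_len + center_len  # 9 + 4 + 9 = 22
    hits = []
    for start in range(len(dna) - site_len + 1):
        window = dna[start:start + site_len]
        center = window[half_site_len:half_site_len + center_len]
        if center in centers:
            hits.append((start, window, center))
    return hits


# Illustrative subset of center sequences (an assumption for this sketch).
example_centers = {"GTGT", "GTAT", "GTAC"}

# Made-up 26-base input: 2 flanking bases, a 22-base site, 2 flanking bases.
example_dna = "CC" + "A" * 9 + "GTAC" + "T" * 9 + "GG"
example_hits = find_candidate_sites(example_dna, example_centers)
```

In practice one would also scan the reverse-complement strand and apply whatever half-site criteria the chosen meganuclease design method imposes.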

Thus, in one aspect, the invention provides a method for cleaving a double-stranded DNA comprising: (a) identifying in the DNA at least one recognition site for a rationally-designed I-CreI-derived meganuclease with altered specificity relative to I-CreI, wherein the recognition site is not cleaved by a naturally-occurring I-CreI, wherein the recognition site has a four base pair central sequence selected from the group consisting of TTGT, TTAT, TCTT, TCGT, TCAT, GTTT, GTCT, GGAT, GAGT, GAAT, ATGT, TTTC, TTCC, TGAC, TAAC, GTTC, ATAT, TCGA, TTAA, GGGC, ACGC, CCGC, CTGC, ACAA, ATAA, AAGA, ACGA, ATGA, AAAC, AGAC, ATCC, ACTC, ATTC, ACAT, GAAA, GGAA, GTCA, GTTA, GAAC, ATAT, TCGA, TTAA, GCCC, GCGT, GCGG and GCAG; (b) providing the rationally-designed meganuclease; and (c) contacting the DNA with the rationally-designed meganuclease; whereby the rationally-designed meganuclease cleaves the DNA.

In another aspect, the invention provides a method for cleaving a double-stranded DNA comprising: (a) introducing into the DNA a recognition site for a rationally-designed I-CreI-derived meganuclease with altered specificity relative to I-CreI, wherein the recognition site is not cleaved by a naturally-occurring I-CreI, wherein the recognition site has a four base pair central sequence selected from the group consisting of TTGT, TTAT, TCTT, TCGT, TCAT, GTTT, GTCT, GGAT, GAGT, GAAT, ATGT, TTTC, TTCC, TGAC, TAAC, GTTC, ATAT, TCGA, TTAA, GGGC, ACGC, CCGC, CTGC, ACAA, ATAA, AAGA, ACGA, ATGA, AAAC, AGAC, ATCC, ACTC, ATTC, ACAT, GAAA, GGAA, GTCA, GTTA, GAAC, ATAT, TCGA, TTAA, GCCC, GCGT, GCGG and GCAG; and (b) providing the rationally-designed meganuclease; and (c) contacting the DNA with the rationally-designed meganuclease; whereby the rationally-designed meganuclease cleaves the DNA.

In some embodiments, the four base pair DNA sequence is selected from the group consisting of GTGT, GTAT, TTAG, GTAG, TTAC, TCTC, TCAC, GTCC, GTAC, TCGC, AAGC, GAGC, GCGC, GTGC, TAGC, TTGC, ATGC, ACAC, ATAC, CTAA, CTAC, GTAA, GAGA, GTGA, GGAC, GTAC, GCGA, GCTT, GCTC, GCGC, GCAC, GCTA, GCAA and GCAT.

In some embodiments, the DNA cleavage is in vitro. In other embodiments, the DNA cleavage is in vivo.

In some embodiments, the DNA is selected from the group consisting of a PCR product; an artificial chromosome; genomic DNA isolated from bacteria, fungi, plants, or animal cells; and viral DNA.

In some embodiments, the DNA is present in a cell selected from the group consisting of a bacterial, fungal, plant and animal cell.

In some embodiments, the DNA is present in a nucleic acid selected from the group consisting of a plasmid, a prophage and a chromosome.

In certain embodiments, the method further comprises rationally-designing the I-CreI-derived meganuclease to recognize the recognition site.

In some embodiments, the method further comprises producing the rationally-designed I-CreI-derived meganuclease.

In another aspect, the invention provides a cell transformed with a nucleic acid comprising, in order: a) a first 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a first domain from a single-chain I-CreI-derived meganuclease; b) a four base pair DNA sequence selected from the group consisting of GTGT, GTAT, TTAG, GTAG, TTAC, TCTC, TCAC, GTCC, GTAC, TCGC, AAGC, GAGC, GCGC, GTGC, TAGC, TTGC, ATGC, ACAC, ATAC, CTAA, CTAC, GTAA, GAGA, GTGA, GGAC, GTAC, GCGA, GCTT, GCTC, GCGC, GCAC, GCTA, GCAA and GCAT; and c) a second 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a second domain from the single-chain I-CreI-derived meganuclease, wherein the second 9 base pair DNA sequence is in the reverse orientation relative to the first.

In yet another aspect, the invention provides a cell containing an exogenous nucleic acid sequence integrated into its genome, comprising, in order: a) a first exogenous 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a first domain from a single-chain I-CreI-derived meganuclease; b) an exogenous four base pair DNA sequence selected from the group consisting of GTGT, GTAT, TTAG, GTAG, TTAC, TCTC, TCAC, GTCC, GTAC, TCGC, AAGC, GAGC, GCGC, GTGC, TAGC, TTGC, ATGC, ACAC, ATAC, CTAA, CTAC, GTAA, GAGA, GTGA, GGAC, GTAC, GCGA, GCTT, GCTC, GCGC, GCAC, GCTA, GCAA and GCAT; and c) a second exogenous 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a second domain from the single-chain I-CreI-derived meganuclease, wherein the second 9 base pair DNA sequence is in the reverse orientation relative to the first.

In some embodiments, the nucleic acid is a plasmid, an artificial chromosome, or a viral nucleic acid.

In some embodiments, the cell is a non-human animal cell, a plant cell, a bacterial cell, or a fungal cell.

In some embodiments, the four base pair DNA sequence is TTGT, TTAT, TCTT, TCGT, TCAT, GTTT, GTCT, GGAT, GAGT, GAAT, ATGT, TTTC, TTCC, TGAC, TAAC, GTTC, ATAT, TCGA, TTAA, GGGC, ACGC, CCGC, CTGC, ACAA, ATAA, AAGA, ACGA, ATGA, AAAC, AGAC, ATCC, ACTC, ATTC, ACAT, GAAA, GGAA, GTCA, GTTA, GAAC, ATAT, TCGA, TTAA, GCCC, GCGT, GCGG or GCAG.

In some embodiments, the four base pair DNA sequence is GTGT, GTAT, TTAG, GTAG, TTAC, TCTC, TCAC, GTCC, GTAC, TCGC, AAGC, GAGC, GCGC, GTGC, TAGC, TTGC, ATGC, ACAC, ATAC, CTAA, CTAC, GTAA, GAGA, GTGA, GGAC, GTAC, GCGA, GCTT, GCTC, GCGC, GCAC, GCTA, GCAA or GCAT.

In yet another aspect, the invention provides a method of conducting a custom-designed, I-CreI-derived meganuclease business comprising: (a) receiving a DNA sequence into which a double-strand break is to be introduced by a rationally-designed I-CreI-derived meganuclease; (b) identifying in the DNA sequence at least one recognition site for a rationally-designed I-CreI-derived meganuclease with altered specificity relative to I-CreI, wherein the recognition site is not cleaved by a naturally-occurring I-CreI, wherein the recognition site has a four base pair central sequence selected from the group consisting of TTGT, TTAT, TCTT, TCGT, TCAT, GTTT, GTCT, GGAT, GAGT, GAAT, ATGT, TTTC, TTCC, TGAC, TAAC, GTTC, ATAT, TCGA, TTAA, GGGC, ACGC, CCGC, CTGC, ACAA, ATAA, AAGA, ACGA, ATGA, AAAC, AGAC, ATCC, ACTC, ATTC, ACAT, GAAA, GGAA, GTCA, GTTA, GAAC, ATAT, TCGA, TTAA, GCCC, GCGT, GCGG and GCAG; and (c) providing the rationally-designed meganuclease.

In some embodiments, the method further comprises rationally-designing the I-CreI-derived meganuclease to recognize the recognition site.

In some embodiments, the method further comprises producing the rationally-designed meganuclease.

In some embodiments, the rationally-designed meganuclease is provided to the same party from which the DNA sequence has been received.

These and other aspects and embodiments of the invention will be apparent to one of ordinary skill in the art from the following detailed description of the invention, figures and appended claims.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A Schematic illustration of the interactions between the naturally-occurring I-CreI homodimer and a double-stranded recognition sequence, based upon crystallographic data. This schematic representation depicts one recognition sequence (SEQ ID NOS 3 and 2, respectively, in order of appearance), shown as unwound for illustration purposes only, bound by the homodimer, shown as two ovals. The bases of each DNA half-site are numbered −1 through −9, and the amino acid residues of I-CreI which form the recognition surface are indicated by one-letter amino acid designations and numbers indicating residue position. The four base pairs that comprise the center sequence are numbered +1 to +4. Solid black lines: hydrogen bonds to DNA bases. FIG. 1B One wild-type I-CreI recognition sequence (SEQ ID NOS 3 and 2, respectively, in order of appearance) showing the locations of the inverted half-sites and center sequence.

FIG. 2A Schematic diagram of the plasmid substrates evaluated to determine center sequence preference (SEQ ID NOS 22-23, respectively, in order of appearance). A set of pUC-19 plasmids were produced which harbored potential recognition sequences for the genetically-engineered meganuclease DJ1. These potential recognition sequences comprised a pair of inverted DJ1 half-sites separated by a variety of different four base pair center sequences (numbered +1 through +4), as described below. FIG. 2B Example of gel electrophoresis data showing DJ1 meganuclease cleavage of plasmid substrates described in FIG. 2A. The “uncut” arrow indicates XmnI linearized plasmid substrate. The “cut” arrows indicate XmnI linearized plasmid substrates which have also been successfully cleaved by DJ1.

FIG. 3A Schematic diagram of a T-DNA that was stably integrated into the Arabidopsis thaliana genome as described in Example 1. In this T-DNA construct, a codon-optimized gene encoding the genetically-engineered BRP2 meganuclease (BRP2) (SEQ ID NO: 8) is under the control of a Hsp70 promoter (HSP) and a NOS terminator (TERM). A pair of potential BRP2 recognition sequences (Site1, Site2) are housed adjacent to the terminator separated by 7 base pairs containing a PstI restriction enzyme site (PstI). A kanamycin resistance marker (Kan) is also housed on the T-DNA to allow selection for stable transformants. FIG. 3B The expected product following BRP2 meganuclease cleavage of Site1 and Site2 showing loss of the intervening 7 base pair fragment and PstI restriction site. Arrows show the location of PCR primers used to screen for cleavage of the T-DNA. FIG. 3C Sequences of the BRP2 recognition sequences housed on either the GTAC construct (SEQ ID NO: 9) (GTAC) or the TAGA construct (SEQ ID NO: 11) (TAGA) with center sequences underlined. FIG. 3D Example of electrophoresis data from a plant transformed with the GTAC construct. Genomic DNA was isolated from the leaves of Arabidopsis seedlings stably transformed with the GTAC T-DNA construct before and after a 2 hour “heat-shock” to induce BRP2 expression. DNA samples were then added to PCR reactions using the primers shown in FIG. 3B. PCR reactions were digested with PstI and visualized by gel electrophoresis. C: control lane lacking PstI. 44, 45, and 46: PCR samples from three representative plants showing nearly complete digestion by PstI in samples taken prior to heat shock (− lanes) and very little digestion by PstI in samples taken after heat-shock (+ lanes). These results indicate that the BRP2 meganuclease was able to cleave the BRP2 recognition sequence which incorporated a GTAC center sequence in vivo.

FIG. 4A Schematic diagram of a T-DNA that was stably integrated into the Arabidopsis thaliana genome as described in Example 2. In this T-DNA construct, a codon-optimized gene encoding the BRP12-SC meganuclease (BRP12-SC) (SEQ ID NO: 15) is under the control of a Hsp70 promoter (HSP) and a NOS terminator (TERM). A pair of potential BRP12-SC recognition sequences (Site1, Site2) are housed adjacent to the terminator separated by 7 base pairs containing a PstI restriction enzyme site (PstI). A kanamycin resistance marker (Kan) is also housed on the T-DNA to allow selection for stable transformants. FIG. 4B The expected product following BRP12-SC meganuclease cleavage of Site1 and Site2 showing loss of the intervening 7 base pair fragment and PstI restriction site. Arrows show the location of PCR primers used to screen for cleavage of the T-DNA. FIG. 4C Sequences of the BRP12-SC recognition sequences housed on either the GTAC construct (SEQ ID NO: 16) (GTAC) or the TAGA construct (SEQ ID NO: 18) (TAGA) with center sequences underlined.

FIG. 5. Graphic representation of the effects of meganuclease concentration and center sequence on in vitro meganuclease cleavage. The BRP2 meganuclease (SEQ ID NO: 8, see Example 1) was added at the indicated concentration to a digest reaction containing 0.25 picomoles of a plasmid substrate harboring either a BRP2 recognition sequence with the center sequence GTAC or a BRP2 recognition sequence with the center sequence TAGA. Reactions were 25 microliters in SA buffer (25 mM Tris-HCl, pH 8.0, 100 mM NaCl, 5 mM MgCl₂, 5 mM EDTA). Reactions were incubated at 37° C. for 2 hours and were then visualized by gel electrophoresis, and the percent of plasmid substrate cleaved by the meganuclease was plotted as a function of meganuclease concentration.

DETAILED DESCRIPTION OF THE INVENTION

1.1 Introduction

The present invention is based, in part, upon the identification and characterization of particular DNA sequences that are more efficiently cleaved by the rationally-designed, I-CreI-derived meganucleases. Specifically, the invention is based on the discovery that certain four-base pair DNA sequences, when incorporated as the central four-base pairs of a rationally-designed, I-CreI-derived meganuclease recognition sequence, can significantly impact cleavage by the corresponding meganuclease although the meganuclease does not, based upon analysis of crystal structures, appear to contact the central four base pairs. As there are four DNA bases (A, C, G, and T), there are 4⁴ or 256 possible DNA sequences that are four base pairs in length. All of these possible sequences were examined to determine the subsets of sequences that are more efficiently cleaved by I-CreI-derived meganucleases. The results of this analysis allow for more accurate prediction of whether or not a particular double-stranded DNA site 22 base pairs in length can be more efficiently cleaved by the I-CreI-derived meganuclease.
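The count of 4⁴ = 256 possible center sequences can be reproduced by direct enumeration. The following Python fragment is an illustrative sketch only; the variable names are hypothetical and it says nothing about how the sequences were tested experimentally.

```python
from itertools import product

BASES = "ACGT"

# Every ordered choice of one base at each of the four center positions.
all_centers = ["".join(p) for p in product(BASES, repeat=4)]

# 4 choices at each of 4 positions gives 4**4 = 256 distinct sequences.
number_of_centers = len(all_centers)
```

Such an enumerated list could serve as the set of center sequences to substitute between a fixed pair of half-sites when preparing a panel of candidate substrates.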

1.2 References and Definitions

The patent and scientific literature referred to herein establishes knowledge that is available to those of skill in the art. The issued U.S. patents, allowed applications, published foreign applications, and references, including GenBank database sequences, that are cited herein are hereby incorporated by reference to the same extent as if each was specifically and individually indicated to be incorporated by reference.

As used herein, the term “I-CreI-derived meganuclease” refers to a rationally-designed (i.e., genetically-engineered) meganuclease that is derived from I-CreI. The term genetically-engineered meganuclease, as used herein, refers to a recombinant variant of an I-CreI homing endonuclease that has been modified by one or more amino acid insertions, deletions or substitutions that affect one or more of DNA-binding specificity, DNA cleavage activity, DNA-binding affinity, and/or dimerization properties. Some genetically-engineered meganucleases are known in the art (see, e.g., Porteus et al. (2005), Nat. Biotechnol. 23: 967-73; Sussman et al. (2004), J. Mol. Biol. 342: 31-41; Epinat et al. (2003), Nucleic Acids Res. 31: 2952-62) and a general method for rationally-designing such variants is disclosed in WO 2007/047859. Additional methods for genetically-engineering such variants are disclosed in WO 04/067736, WO 07/060495, WO 06/097853, WO 07/049095, WO 08/102198, WO 08/010093, WO 08/010009, WO 07/093918, WO 07/093836, WO 08/102274, WO 08/059317, WO 09/013622, WO 09/019614, WO 09/019528, WO 08/152523, WO 04/067753, WO 03/078619, WO 06/097784, WO 07/034262, WO 07/049156, WO 07/057781, WO 08/093152, WO 08/102199, WO 08/102274, WO 08/149176, WO 09/013559, WO 09/013622, and WO 09/019528.

A meganuclease may bind to double-stranded DNA as a homodimer, as is the case for wild-type I-CreI, or it may bind to DNA as a heterodimer. A meganuclease may also be a “single-chain heterodimer” in which a pair of DNA-binding domains derived from I-CreI are joined into a single polypeptide using a peptide linker. The term “homing endonuclease” is synonymous with the term “meganuclease.”

As used herein, the term “rationally-designed” means non-naturally occurring and/or genetically engineered. The rationally-designed meganucleases of the invention differ from wild-type or naturally-occurring meganucleases in their amino acid sequence or primary structure, and may also differ in their secondary, tertiary or quaternary structure. In addition, the rationally-designed meganucleases of the invention also differ from wild-type or naturally-occurring meganucleases in recognition sequence-specificity and/or activity.

As used herein, with respect to a protein, the term “recombinant” means having an altered amino acid sequence as a result of the application of genetic engineering techniques to nucleic acids which encode the protein, and cells or organisms which express the protein. With respect to a nucleic acid, the term “recombinant” means having an altered nucleic acid sequence as a result of the application of genetic engineering techniques. Genetic engineering techniques include, but are not limited to, PCR and DNA cloning technologies; transfection, transformation and other gene transfer technologies; homologous recombination; site-directed mutagenesis; and gene fusion. In accordance with this definition, a protein having an amino acid sequence identical to a naturally-occurring protein, but produced by cloning and expression in a heterologous host, is not considered recombinant.

As used herein, the term “genetically-modified” refers to a cell or organism in which, or in an ancestor of which, a genomic DNA sequence has been deliberately modified by recombinant technology. As used herein, the term “genetically-modified” encompasses the term “transgenic.”

As used herein, the term “wild-type” refers to any naturally-occurring form of a meganuclease. The term “wild-type” is not intended to mean the most common allelic variant of the enzyme in nature but, rather, any allelic variant found in nature. Wild-type meganucleases are distinguished from recombinant or non-naturally-occurring meganucleases.

As used herein, the term “recognition sequence half-site” or simply “half site” means a 9 base pair DNA sequence which is recognized by a meganuclease monomer, in the case of a dimeric meganuclease, or by one domain of a single-chain meganuclease.

As used herein, the term “recognition sequence” refers to a pair of half-sites which is bound and cleaved by a meganuclease. A recognition sequence comprises a pair of inverted, 9 base pair half sites separated by four base-pairs. The recognition sequence is, therefore, 22 base-pairs in length. The base pairs of each half-site are designated −9 through −1, with the −9 position being most distal from the cleavage site and the −1 position being adjacent to the 4 base pair center sequence, the base pairs of which are designated +1 through +4. The strand of each half-site which is oriented 5′ to 3′ in the direction from −9 to −1 (i.e., towards the cleavage site), is designated the “sense” strand, and the opposite strand is designated the “antisense strand”, although neither strand may encode protein. Thus, the “sense” strand of one half-site is the antisense (opposite) strand of the other half-site. See, for example, FIG. 1A.
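The layout defined above (9 + 4 + 9 = 22 base pairs, with each half-site read 5′ to 3′ from −9 to −1 on its own “sense” strand) can be sketched as follows. This Python fragment is a minimal illustration; the function names and the example site are hypothetical, and the example sequence is made up rather than taken from the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_recognition_sequence(site):
    """Split a 22-base top-strand sequence into
    (first half-site, center sequence, second half-site),
    with both half-sites written 5'->3' from -9 to -1 on their sense strands."""
    site = site.upper()
    if len(site) != 22:
        raise ValueError("a recognition sequence is 22 base pairs long")
    first_half = site[:9]            # positions -9..-1, top strand
    center = site[9:13]              # positions +1..+4
    # The second half-site is inverted, so its sense strand is the bottom
    # strand; the reverse complement gives it in the -9..-1 direction.
    second_half = reverse_complement(site[13:])
    return first_half, center, second_half


# Made-up example site (not a sequence from this document).
example_site = "AAAACCCCG" + "GTAC" + "CGGGGTTTT"
first_half, center, second_half = split_recognition_sequence(example_site)
```

Writing both half-sites in the same −9 to −1 direction makes them directly comparable position by position.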

As used herein, the term “center sequence” refers to the four base pairs separating half sites in the meganuclease recognition sequence. These bases are numbered +1 through +4 in FIG. 1A. The center sequence comprises the four bases that become the 3′ single-strand overhangs following meganuclease cleavage. “Center sequence” can refer to the sequence of the sense strand or the antisense (opposite) strand.
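Because a center sequence may be written from either strand, the same four base pair duplex can appear under two names, for example GTGA and its reverse complement TCAC. The following Python fragment is an illustrative sketch, assuming one chooses to treat the two strand readings as describing the same duplex; the helper `canonical_center` is hypothetical and not a convention defined in this document. Under that assumption, the 256 sequences collapse to 136 distinct duplexes, since 16 of them equal their own reverse complement: (256 − 16)/2 + 16 = 136.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def canonical_center(center):
    """Pick one name per duplex: the alphabetically smaller of the
    two strand readings (an arbitrary choice made for this sketch)."""
    center = center.upper()
    return min(center, reverse_complement(center))


all_centers = ["".join(p) for p in product("ACGT", repeat=4)]
self_complementary = [c for c in all_centers if c == reverse_complement(c)]
distinct_duplexes = {canonical_center(c) for c in all_centers}
```

Whether both readings of a duplex behave identically with a given meganuclease is an experimental question that this bookkeeping does not address.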

As used herein, the term “specificity” refers to the ability of a meganuclease to recognize and cleave double-stranded DNA molecules only at a particular subset of all possible recognition sequences. The set of recognition sequences will share certain conserved positions or sequence motifs, but may be degenerate at one or more positions. A more specific meganuclease is capable of binding and cleaving a smaller subset of the possible recognition sequences, whereas a less specific meganuclease is capable of binding and cleaving a larger subset of the possible recognition sequences.

As used herein, the term “palindromic” refers to a recognition sequence consisting of inverted repeats of identical half-sites. In this case, however, the palindromic sequence need not be palindromic with respect to the center sequence, which is not contacted by the enzyme. In the case of dimeric meganucleases, palindromic DNA sequences are recognized by homodimers in which the two monomers make contacts with identical half-sites.

As used herein, the term “pseudo-palindromic” refers to a recognition sequence consisting of inverted repeats of non-identical or imperfectly palindromic half-sites. In this case, the pseudo-palindromic sequence need not be palindromic with respect to the center sequence, and also can deviate from a perfectly palindromic sequence between the two half-sites. Pseudo-palindromic DNA sequences are typical of the natural DNA sites recognized by wild-type homodimeric meganucleases in which two identical enzyme monomers make contacts with different half-sites.

As used herein, the term “non-palindromic” refers to a recognition sequence composed of two unrelated half-sites of a meganuclease. In this case, the non-palindromic sequence need not be palindromic with respect to either the center sequence or the two monomer half-sites. Non-palindromic DNA sequences are recognized by either heterodimeric meganucleases or single-chain meganucleases comprising a pair of domains that recognize non-identical half-sites.
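The palindromic definition above compares only the two half-sites and ignores the center sequence. The following Python fragment is a minimal sketch of that comparison; the function names and example sites are hypothetical. It also counts mismatched half-site positions, but the definitions above give no numerical threshold separating “pseudo-palindromic” from “non-palindromic”, so the sketch does not attempt that classification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def half_site_mismatches(site):
    """Count positions (-9..-1) at which the two half-sites of a
    22-base site differ when both are read on their sense strands."""
    site = site.upper()
    if len(site) != 22:
        raise ValueError("a recognition sequence is 22 base pairs long")
    first_half = site[:9]
    second_half = reverse_complement(site[13:])
    return sum(a != b for a, b in zip(first_half, second_half))


def is_palindromic_site(site):
    """True when the half-sites are identical inverted repeats;
    the four-base center sequence is deliberately not examined."""
    return half_site_mismatches(site) == 0


# Made-up examples (not sequences from this document).
identical_halves = "AAAACCCCG" + "GTGA" + "CGGGGTTTT"   # center is not palindromic
one_difference = "AAAACCCCG" + "GTAC" + "CGGGGTTTA"
```

In this sketch the first example counts as palindromic even though its center, GTGA, is not its own reverse complement, which mirrors the definition given above.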

As used herein, the term “activity” refers to the rate at which a meganuclease of the invention cleaves a particular recognition sequence. Such activity is a measurable enzymatic reaction, involving the hydrolysis of phosphodiester bonds of double-stranded DNA. The activity of a meganuclease acting on a particular DNA substrate is affected by the affinity or avidity of the meganuclease for that particular DNA substrate which is, in turn, affected by both sequence-specific and non-sequence-specific interactions with the DNA.

As used herein, the term “homologous recombination” refers to thenatural, cellular process in which a double-stranded DNA-break isrepaired using a homologous DNA sequence as the repair template (see,e.g., Cahill et al. (2006), Front. Biosci. 11:1958-1976). The homologousDNA sequence may be an endogenous chromosomal sequence or an exogenousnucleic acid that was delivered to the cell. Thus, for someapplications, a meganuclease is used to cleave a recognition sequencewithin a target sequence in a genome and an exogenous nucleic acid withhomology to or substantial sequence similarity with the target sequenceis delivered into the cell and used as a template for repair byhomologous recombination. The DNA sequence of the exogenous nucleicacid, which may differ significantly from the target sequence, isthereby incorporated into the chromosomal sequence. The process ofhomologous recombination occurs primarily in eukaryotic organisms. Theterm “homology” is used herein as equivalent to “sequence similarity”and is not intended to require identity by descent or phylogeneticrelatedness.

As used herein, the term “non-homologous end-joining” refers to thenatural, cellular process in which a double-stranded DNA-break isrepaired by the direct joining of two non-homologous DNA segments (see,e.g., Cahill et al. (2006), Front. Biosci. 11:1958-1976). DNA repair bynon-homologous end-joining is error-prone and frequently results in theuntemplated addition or deletion of DNA sequences at the site of repair.Thus, for some applications, a meganuclease can be used to produce adouble-stranded break at a meganuclease recognition sequence within atarget sequence in a genome to disrupt a gene (e.g., by introducing baseinsertions, base deletions, or frameshift mutations) by non-homologousend-joining. For other applications, an exogenous nucleic acid lackinghomology to or substantial sequence similarity with the target sequencemay be captured at the site of a meganuclease-stimulated double-strandedDNA break by non-homologous end-joining (see, e.g., Salomon et al.(1998), EMBO J. 17:6086-6095). The process of non-homologous end-joiningoccurs in both eukaryotes and prokaryotes such as bacteria.

As used herein, the term “sequence of interest” means any nucleic acid sequence, whether it codes for a protein, RNA, or regulatory element (e.g., an enhancer, silencer, or promoter sequence), that can be inserted into a genome or used to replace a genomic DNA sequence using a meganuclease protein. Sequences of interest can have heterologous DNA sequences that allow for tagging a protein or RNA that is expressed from the sequence of interest. For instance, a protein can be tagged with tags including, but not limited to, an epitope (e.g., c-myc, FLAG) or other ligand (e.g., poly-His). Furthermore, a sequence of interest can encode a fusion protein, according to techniques known in the art (see, e.g., Ausubel et al., Current Protocols in Molecular Biology, Wiley 1999). For some applications, the sequence of interest is flanked by a DNA sequence that is recognized by the meganuclease for cleavage. Thus, the flanking sequences are cleaved, allowing for proper insertion of the sequence of interest into genomic recognition sequences cleaved by a meganuclease. For some applications, the entire sequence of interest is homologous to or has substantial sequence similarity with a target sequence in the genome such that homologous recombination effectively replaces the target sequence with the sequence of interest. For other applications, the sequence of interest is flanked by DNA sequences with homology to or substantial sequence similarity with the target sequence such that homologous recombination inserts the sequence of interest within the genome at the locus of the target sequence. For some applications, the sequence of interest is substantially identical to the target sequence except for mutations or other modifications in the meganuclease recognition sequence such that the meganuclease cannot cleave the target sequence after it has been modified by the sequence of interest.

As used herein, the term “single-chain meganuclease” refers to a polypeptide comprising a pair of meganuclease subunits joined by a linker. A single-chain meganuclease has the organization: N-terminal subunit-Linker-C-terminal subunit. The two meganuclease subunits, each of which is derived from I-CreI, will generally be non-identical in amino acid sequence and will recognize non-identical half-sites. Thus, single-chain meganucleases typically cleave pseudo-palindromic or non-palindromic recognition sequences. A single-chain meganuclease may be referred to as a “single-chain heterodimer” or “single-chain heterodimeric meganuclease” although it is not, in fact, dimeric.

As used herein, unless specifically indicated otherwise, the word “or” is used in the inclusive sense of “and/or” and not the exclusive sense of “either/or.”

2.1 Preferred Center Sequences for I-CreI-Derived Meganucleases

The present invention is based, in part, on the identification of subsets of the possible four base pair center sequences that are preferred by I-CreI-derived meganucleases. As the wild-type enzyme does not make significant contacts with the bases in the center sequence, the same center sequence preferences of the wild-type I-CreI homing nuclease apply to rationally-designed I-CreI-derived meganucleases which have been redesigned with respect to, for example, half-site preference, DNA-binding affinity, and/or heterodimerization ability. This invention provides, therefore, important criteria that can be considered in determining whether or not a particular 22 base pair DNA sequence is a suitable I-CreI-derived meganuclease recognition sequence.

The preferred set of center sequences was determined using a genetically-engineered meganuclease called “DJ1” (SEQ ID NO: 4). The production of this meganuclease is described in WO 2007/047859. DJ1 is a homodimeric I-CreI-derived meganuclease which was designed to recognize a palindromic meganuclease recognition sequence (SEQ ID NO: 5, SEQ ID NO: 6) that differs at 4 positions per half-site relative to wild-type I-CreI. This change in half-site specificity was achieved by the introduction of 6 amino acid substitutions to wild-type I-CreI (K28D, N30R, S32N, Q38E, S40R, and T42R).

To test for cleavage activity with respect to various recognition sequences, DJ1 was expressed in E. coli and purified as described in Example 1 of WO 2007/047859. Then, 25 picomoles of purified meganuclease protein were added to a 10 nM solution of plasmid DNA substrate in SA buffer (25 mM Tris-HCl, pH 8.0, 100 mM NaCl, 5 mM MgCl₂, 5 mM EDTA) in a 25 microliter reaction. 1 microliter of XmnI restriction enzyme was added to linearize the plasmid substrates. Reactions were incubated at 37° C. for 4 hours and were then visualized by gel electrophoresis to determine the extent to which each was cleaved by the DJ1 meganuclease.
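The quantities given for the cleavage reaction imply a large molar excess of enzyme over substrate. The following arithmetic sketch makes that explicit. It assumes that 10 nM is the final plasmid concentration in the 25 microliter reaction, which the text does not state unambiguously; the variable names are illustrative only.

```python
# Quantities stated in the text.
enzyme_pmol = 25.0    # picomoles of purified meganuclease
substrate_nM = 10.0   # plasmid substrate concentration (assumed to be final)
volume_uL = 25.0      # reaction volume in microliters

# 1 nM equals 1 fmol per microliter, so nM * uL gives femtomoles.
substrate_pmol = substrate_nM * volume_uL / 1000.0

# 1 pmol per microliter equals 1 micromolar.
enzyme_uM = enzyme_pmol / volume_uL

# Molar ratio of meganuclease protein to plasmid substrate.
molar_ratio = enzyme_pmol / substrate_pmol

print(substrate_pmol, enzyme_uM, molar_ratio)
```

Under that assumption the reaction contains 0.25 picomoles of plasmid and 1 micromolar meganuclease protein, a 100-fold molar excess of protein over substrate (counted per polypeptide, not per dimer).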

The plasmid substrates used in these experiments comprised a pUC-19 plasmid in which a potential meganuclease recognition sequence was inserted into the polylinker site (SmaI site). Each potential meganuclease recognition site comprised a pair of inverted DJ1 half-sites (SEQ ID NO: 7) separated by a different center sequence. Thus, by evaluating DJ1 cleavage of multiple DNA substrates differing only by center sequence, it was possible to determine which center sequences are the most amenable to meganuclease cleavage (FIG. 2).

Initially, only the influence of the N₊₂ and N₊₃ bases was evaluated. The X-ray crystal structure of I-CreI in complex with its natural DNA site shows that the DNA is distorted at these central two base pairs (Jurica et al. (1998), Mol Cell. 2:469-76). Computer modeling suggests that a purine (G or A) at N₊₂ is incompatible with a pyrimidine (C or T) at N₊₃. This is because the distortion introduced by I-CreI binding causes a steric clash between a purine base at N₊₂ and a second purine base-paired to a pyrimidine at N₊₃. This expected incompatibility was verified experimentally by incubating DJ1 protein with plasmid substrates harboring meganuclease recognition sites with all possible center sequences of the form A₊₁X₊₂X₊₃T₊₄ in which X is any base. The results are summarized in Table 1. For Tables 1-5, “Activity” refers to the following:

-   −: no cleavage in 4 hours
-   +: 1%-25% cleavage in 4 hours
-   ++: 26%-75% cleavage in 4 hours
-   +++: 75%-100% cleavage in 4 hours

TABLE 1 The effect of changes at N₊₂ and N₊₃

Seq. No.   N₊₁   N₊₂   N₊₃   N₊₄   Activity
1          A     A     A     T     +
2          A     A     C     T     −
3          A     A     G     T     +
4          A     A     T     T     −
5          A     C     A     T     +
6          A     C     C     T     +
7          A     C     G     T     +
8          A     C     T     T     +
9          A     G     A     T     +
10         A     G     C     T     −
11         A     G     G     T     +
12         A     G     T     T     −
13         A     T     A     T     ++
14         A     T     C     T     +
15         A     T     G     T     ++
16         A     T     T     T     +

Consistent with the computer modeling, it was found that the four plasmid substrates with a purine base at N₊₂ and a pyrimidine base at N₊₃ (sequence numbers 2, 4, 10, and 12) were not cut efficiently by DJ1.

Next, a more comprehensive evaluation of center sequence preference was performed. There are 4⁴ or 256 possible center sequences. Of these, 25%, or 64, have a purine base at N₊₂ and pyrimidine at N₊₃ and, therefore, were eliminated as center sequences based on the experiment described above. Of the remaining 192, 92 are redundant because meganucleases are symmetric and recognize bases equally on both the sense and antisense strand. For example, the sequence A₊₁A₊₂A₊₃A₊₄ on the sense strand is recognized by the meganuclease as T₊₁T₊₂T₊₃T₊₄ on the antisense strand and, thus, A₊₁A₊₂A₊₃A₊₄ and T₊₁T₊₂T₊₃T₊₄ are functionally equivalent. Taking these redundancies into account, as well as the aforementioned N₊₂/N₊₃ conflicts, there were 100 possible center sequences remaining. To determine which of these were preferred by meganucleases, we produced 100 plasmid substrates harboring these 100 center sequences flanked by inverted recognition half-sites for the DJ1 meganuclease. DJ1 was then incubated with each of the 100 plasmids and cleavage activity was evaluated as described above. These results are summarized in Table 2.
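The counting argument above (256 possible center sequences, 64 eliminated, 92 redundant, 100 remaining) can be reproduced by direct enumeration. The sketch below is illustrative only; it treats two center sequences as equivalent when one is the opposite-strand (reverse complement) sequence of the other, and all names in it are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = set("AG")
PYRIMIDINES = set("CT")


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# All 4^4 four-base center sequences (index 0 is N+1, index 3 is N+4).
all_centers = ["".join(p) for p in product("ACGT", repeat=4)]

# Eliminated: purine at N+2 together with pyrimidine at N+3.
excluded = {c for c in all_centers if c[1] in PURINES and c[2] in PYRIMIDINES}
remaining = [c for c in all_centers if c not in excluded]

# Collapse each sequence with its opposite-strand equivalent.
unique = {min(c, reverse_complement(c)) for c in remaining}
redundant = len(remaining) - len(unique)

print(len(all_centers), len(excluded), len(remaining), redundant, len(unique))
```

The enumeration gives 256, 64, 192, 92, and 100, matching the figures in the text. The eight self-complementary sequences among the remaining 192 are the reason the redundancy is 92 rather than 96.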

TABLE 2 Cleavable Center Sequences

Seq. No.   N₊₁   N₊₂   N₊₃   N₊₄   Activity
1          T     T     T     T     +
2          T     T     G     T     ++
3          T     T     C     T     +
4          T     T     A     T     ++
5          T     G     G     T     +
6          T     G     A     T     +
7          T     C     T     T     ++
8          T     C     G     T     ++
9          T     C     C     T     +
10         T     C     A     T     ++
11         T     A     G     T     +
12         T     A     A     T     +
13         G     T     T     T     ++
14         G     T     G     T     +++
15         G     T     C     T     ++
16         G     T     A     T     +++
17         G     G     G     T     +
18         G     G     A     T     ++
19         G     A     G     T     ++
20         G     A     A     T     ++
21         C     T     T     T     +
22         C     T     G     T     +
23         C     T     C     T     +
24         C     T     A     T     +
25         C     G     G     T     +
26         C     G     A     T     +
27         C     C     T     T     +
28         C     C     G     T     +
29         C     C     C     T     +
30         C     C     A     T     +
31         C     A     G     T     +
32         C     A     A     T     +
33         A     T     T     T     +
34         A     T     G     T     ++
35         A     T     C     T     +
36         A     G     G     T     +
37         A     C     T     T     +
38         T     T     T     G     +
39         T     T     G     G     +
40         T     T     C     G     +
41         T     T     A     G     +++
42         T     G     G     G     +
43         T     G     A     G     +
44         T     C     T     G     +
45         T     C     G     G     +
46         T     C     C     G     +
47         T     C     A     G     +
48         T     A     G     G     +
49         T     A     A     G     +
50         G     T     T     G     +
51         G     T     G     G     +
52         G     T     C     G     +
53         G     T     A     G     +++
54         G     G     G     G     +
55         G     G     A     G     +
56         G     A     G     G     +
57         G     A     A     G     +
58         C     T     T     G     +
59         C     T     G     G     +
60         C     T     C     G     +
61         C     G     G     G     +
62         C     C     T     G     +
63         T     T     T     C     ++
64         T     T     C     C     ++
65         T     T     A     C     +++
66         T     G     A     C     ++
67         T     C     T     C     +++
68         T     C     C     C     +
69         T     C     A     C     +++
70         T     A     A     C     ++
71         G     T     T     C     ++
72         G     T     C     C     +++
73         T     T     T     A     +
74         T     T     G     A     +
75         T     T     C     A     +
76         T     G     G     A     +
77         T     C     T     A     +
78         A     T     A     T     ++
79         A     C     G     T     +
80         C     T     A     G     +
81         C     C     G     G     +
82         G     T     A     C     +++
83         T     C     G     A     ++
84         T     T     A     A     ++
85         T     C     G     C     +++
86         A     A     G     C     +++
87         G     A     G     C     +++
88         G     C     G     C     +++
89         G     G     G     C     ++
90         G     T     G     C     +++
91         T     A     G     C     +++
92         T     G     G     C     +
93         T     T     G     C     +++
94         A     C     G     C     ++
95         A     G     G     C     +
96         A     T     G     C     +++
97         C     A     G     C     +
98         C     C     G     C     ++
99         C     G     G     C     +
100        C     T     G     C     ++

For clarity, each of the center sequences listed in Table 2 is equivalent to its opposite strand sequence because the I-CreI meganuclease binds its recognition sequence as a symmetric homodimer. Thus, sequence no. 100 in Table 2, C₊₁T₊₂G₊₃C₊₄, is equivalent to its opposite strand sequence, G₊₁C₊₂A₊₃G₊₄. From these data, a general set of center sequence preference rules emerges. These rules, which are not meant to supersede Table 1 or Table 2, include:

1. Center sequences with a purine base at N₊₂ and a pyrimidine base at N₊₃ cut very poorly, if at all.
2. G is preferred at N₊₁. This is equivalent to C at N₊₄. All of the most preferred center sequences have G at N₊₁ and/or C at N₊₄.
3. C is preferred at N₊₂. This is equivalent to G at N₊₃.
4. There is a preference for center sequences with a pyrimidine base at N₊₂ and a purine base at N₊₃.
5. There is a preference for sequences with at least 1 A-T base pair in the center sequence.

Thus, in general, preferred center sequences have the form G₊₁Y₊₂R₊₃X₊₄ where Y is a pyrimidine (C or T), R is a purine (A or G), and X is any base (A, C, G, or T).
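The general form G₊₁Y₊₂R₊₃X₊₄ can be expressed as a simple pattern test. The sketch below is a hedged illustration with hypothetical function names; it checks a center sequence as written and, because a center sequence is equivalent to its opposite strand sequence, also on the opposite strand. Consistent with the statement that the rules do not supersede the tables, some sequences scored +++ in Table 2 (for example TTAG) do not fit the general form on either strand.

```python
import re
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# G at N+1, pyrimidine at N+2, purine at N+3, any base at N+4.
GENERAL_FORM = re.compile(r"G[CT][AG][ACGT]")


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def matches_general_form(center):
    """True if the center sequence, as written, has the form GYRX."""
    return GENERAL_FORM.fullmatch(center.upper()) is not None


def matches_on_either_strand(center):
    """True if the center sequence or its opposite strand has the form GYRX."""
    return matches_general_form(center) or matches_general_form(
        reverse_complement(center)
    )


# Every four-base sequence that fits the general form as written.
matching = [
    "".join(p) for p in product("ACGT", repeat=4) if matches_general_form("".join(p))
]
print(len(matching), matching)
```

The pattern admits 1 × 2 × 2 × 4 = 16 sequences as written. A sequence such as TCGC fits only through its opposite strand sequence, GCGA.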

2.2 In Vitro Applications Using Preferred Center Sequences.

Genetically-engineered meganucleases have numerous potential in vitro applications including restriction mapping and cloning. These applications are known in the art and are discussed in WO 2007/047859.

One advantage of using genetically-engineered meganucleases rather than conventional restriction enzymes for applications such as cloning is the possibility of cutting DNA to leave a wide range of different 3′ overhangs (“sticky ends”) that are compatible with, for example, the 3′ overhangs produced by cleaving a particular vector of interest. Thus, there are occasions when it is desirable to cleave a meganuclease recognition sequence with a sub-optimal center sequence in order to create a desired overhang.

Because in vitro DNA cleavage conditions are, in general, less stringent than conditions in vivo, the use of sub-optimal center sequences may be acceptable for such applications. For example, relative to in vivo applications, in vitro digests using engineered meganucleases can be performed at a higher ratio of meganuclease to DNA, there is typically less non-specific (genomic) DNA competing for meganuclease, and solution conditions can be optimized to favor meganuclease cleavage (e.g., using SA buffer as described above). Thus, a larger number of center sequences are suitable for in vitro applications than for in vivo applications. All of the center sequences listed in Table 2 are suitable for in vitro applications, but preferred and most preferred center sequences for in vitro applications are listed in Table 3 and Table 4, respectively, with their opposite strand sequences.

TABLE 3 Preferred Center Sequences for in vitro Applications

Seq. No.   N₊₁N₊₂N₊₃N₊₄   Opposite Strand Sequence
1          TTGT           ACAA
2          TTAT           ATAA
3          TCTT           AAGA
4          TCGT           ACGA
5          TCAT           ATGA
6          GTTT           AAAC
7          GTCT           AGAC
8          GGAT           ATCC
9          GAGT           ACTC
10         GAAT           ATTC
11         ATGT           ACAT
12         TTTC           GAAA
13         TTCC           GGAA
14         TGAC           GTCA
15         TAAC           GTTA
16         GTTC           GAAC
17         ATAT           ATAT
18         TCGA           TCGA
19         TTAA           TTAA
20         GGGC           GCCC
21         ACGC           GCGT
22         CCGC           GCGG
23         CTGC           GCAG

TABLE 4 Most Preferred Center Sequences for in vitro Applications

Seq. No.   N₊₁N₊₂N₊₃N₊₄   Opposite Strand Sequence
1          GTGT           ACAC
2          GTAT           ATAC
3          TTAG           CTAA
4          GTAG           CTAC
5          TTAC           GTAA
6          TCTC           GAGA
7          TCAC           GTGA
8          GTCC           GGAC
9          GTAC           GTAC
10         TCGC           GCGA
11         AAGC           GCTT
12         GAGC           GCTC
13         GCGC           GCGC
14         GTGC           GCAC
15         TAGC           GCTA
16         TTGC           GCAA
17         ATGC           GCAT

Obviously, not every 22 base pair DNA sequence having a preferred or most preferred center sequence is capable of being a meganuclease recognition sequence in vitro. The sequence of the half-sites flanking the center sequence must also be amenable to meganuclease recognition and cleavage. Methods for engineering a meganuclease, including I-CreI, to recognize a pre-determined half-site are known in the art (see, e.g., WO 2007/047859). Thus, a preferred I-CreI-derived meganuclease recognition sequence for in vitro applications will comprise: (1) a first 9 base pair half-site amenable to recognition by a meganuclease monomer (or a first domain of a single-chain meganuclease); (2) a preferred or most preferred center sequence from Table 3 or Table 4; and (3) a second 9 base pair half-site amenable to recognition by a meganuclease monomer (or a second domain of a single-chain meganuclease) in the reverse orientation relative to the first half-site.

Thus, in one aspect, the invention provides methods for cleaving a double-stranded DNA in vitro by (a) identifying at least one potential recognition site for at least one I-CreI-derived meganuclease within the DNA, wherein the potential recognition site has a four base pair central sequence selected from the group of central sequences of Table 2; (b) identifying an I-CreI-derived meganuclease which recognizes that recognition site in the DNA; and (c) contacting the I-CreI-derived meganuclease with the DNA; whereby the I-CreI-derived meganuclease cleaves the DNA.

In another aspect, the invention provides methods for cleaving a double-stranded DNA in vitro by (a) introducing into the DNA a recognition site for an I-CreI-derived meganuclease having a four base pair central sequence selected from the group consisting of central sequences of Table 2; and (b) contacting the I-CreI-derived meganuclease with the DNA; whereby the I-CreI-derived meganuclease cleaves the DNA.

In particular, in some embodiments, the DNA is selected from a PCR product; an artificial chromosome; genomic DNA isolated from bacteria, fungi, plants, or animal cells; and viral DNA.

In some embodiments, the DNA is present in a nucleic acid selected from a plasmid, a prophage and a chromosome.

In some of the foregoing embodiments, the four base pair DNA sequence is selected from Table 3. In other embodiments, the four base pair DNA sequence is selected from Table 4.

In some embodiments, the I-CreI-derived meganuclease can be specifically designed for use with the chosen recognition site in the method.
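The identification step of the methods above, finding potential recognition sites by their four base pair central sequence, can be sketched as a sliding-window scan. This is only an illustration under stated assumptions: the function and variable names are hypothetical, the Table 4 center sequences are used as the example set, and the scan screens the center sequence only. Whether the flanking 9 base pair half-sites are amenable to recognition must still be assessed separately.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Center sequences from Table 4 (opposite strands are added below).
TABLE_4_CENTERS = {
    "GTGT", "GTAT", "TTAG", "GTAG", "TTAC", "TCTC", "TCAC", "GTCC", "GTAC",
    "TCGC", "AAGC", "GAGC", "GCGC", "GTGC", "TAGC", "TTGC", "ATGC",
}


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def candidate_sites(dna, centers):
    """Return (start, half_site_1, center, half_site_2) for each 22 bp window
    whose center sequence, on either strand, is in the given set."""
    dna = dna.upper()
    allowed = set(centers) | {reverse_complement(c) for c in centers}
    hits = []
    for start in range(len(dna) - 22 + 1):
        window = dna[start:start + 22]
        center = window[9:13]
        if center in allowed:
            hits.append((start, window[:9], center, window[13:]))
    return hits


# Hypothetical test sequence: one GTAC center embedded in filler bases.
example = "CCCCC" + "A" * 9 + "GTAC" + "A" * 9 + "CCCCC"
for hit in candidate_sites(example, TABLE_4_CENTERS):
    print(hit)
```

In this example a single window, starting at offset 5, has an allowed center sequence. Substituting the Table 2 or Table 3 sequences for the example set widens the screen accordingly.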

2.3 In Vivo Applications Using Preferred Center Sequences.

Applications such as gene therapy, cell engineering, and plant engineering require meganuclease function inside of a living cell (for clarity, any intracellular application will be referred to as an “in vivo” application whether or not such cell is isolated or part of a multicellular organism). These applications are known in the art and are described in, e.g., WO 2007/047859. In vivo applications are significantly restricted relative to in vitro applications with regard to the center sequence. This is because intracellular conditions cannot be manipulated to any great extent to favor meganuclease activity and/or because vast amounts of genomic DNA compete for meganuclease binding. Thus, only meganuclease recognition sequences with optimal center sequences are preferred for in vivo applications. Such sequences are listed in Table 5 with their opposite strand sequences.

TABLE 5 Preferred Center Sequences for in vivo Applications

Seq. No.   N₊₁N₊₂N₊₃N₊₄   Opposite Strand Sequence
1          GTGT           ACAC
2          GTAT           ATAC
3          TTAG           CTAA
4          GTAG           CTAC
5          TTAC           GTAA
6          TCTC           GAGA
7          TCAC           GTGA
8          GTCC           GGAC
9          GTAC           GTAC
10         TCGC           GCGA
11         AAGC           GCTT
12         GAGC           GCTC
13         GCGC           GCGC
14         GTGC           GCAC
15         TAGC           GCTA
16         TTGC           GCAA
17         ATGC           GCAT

Obviously, not every 22 base pair DNA sequence having a preferred center sequence is capable of being a meganuclease recognition sequence in vivo. The sequence of the half-sites flanking the center sequence must also be amenable to meganuclease recognition and cleavage. Methods for engineering a meganuclease, including I-CreI, to recognize a pre-determined half-site are known in the art (see, e.g., WO 2007/047859). Thus, a preferred in vivo meganuclease recognition sequence will comprise: (1) a first 9 base pair half-site amenable to recognition by a meganuclease monomer (or a first domain of a single-chain meganuclease); (2) a preferred center sequence from Table 5; and (3) a second 9 base pair half-site amenable to recognition by a meganuclease monomer (or a second domain of a single-chain meganuclease) in the reverse orientation relative to the first half-site.
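The relationship between the center sequence tables can be summarized as a lookup that assigns a center sequence to a tier. The sketch below is an illustration only; the tier labels and function names are hypothetical, and a sequence is matched on either strand because each center sequence is equivalent to its opposite strand sequence. Table 5 lists the same sequences as Table 4, so one set serves for both.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Table 3: preferred for in vitro applications.
TABLE_3 = {
    "TTGT", "TTAT", "TCTT", "TCGT", "TCAT", "GTTT", "GTCT", "GGAT", "GAGT",
    "GAAT", "ATGT", "TTTC", "TTCC", "TGAC", "TAAC", "GTTC", "ATAT", "TCGA",
    "TTAA", "GGGC", "ACGC", "CCGC", "CTGC",
}

# Table 4 (most preferred in vitro) and Table 5 (preferred in vivo).
TABLE_5 = {
    "GTGT", "GTAT", "TTAG", "GTAG", "TTAC", "TCTC", "TCAC", "GTCC", "GTAC",
    "TCGC", "AAGC", "GAGC", "GCGC", "GTGC", "TAGC", "TTGC", "ATGC",
}


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def center_tier(center):
    """Assign a four-base center sequence to a (hypothetically named) tier."""
    center = center.upper()
    if len(center) != 4:
        raise ValueError("expected a four-base center sequence")
    both_strands = {center, reverse_complement(center)}
    if both_strands & TABLE_5:
        return "in_vivo_preferred"
    if both_strands & TABLE_3:
        return "in_vitro_preferred"
    return "not_preferred"


for example in ("GTAC", "ACAA", "TAGA"):
    print(example, center_tier(example))
```

GTAC appears in Table 5, ACAA is the opposite strand sequence of TTGT in Table 3, and TAGA is the opposite strand sequence of Table 2 sequence 77 (TCTA), which appears in neither Table 3 nor Table 5.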

Thus, in one aspect, the invention provides methods for cleaving a double-stranded DNA in vivo by (a) identifying at least one potential recognition site for at least one I-CreI-derived meganuclease within the DNA, wherein the potential recognition site has a four base pair central sequence selected from the group of central sequences of Table 2; (b) identifying an I-CreI-derived meganuclease which recognizes that recognition site in the DNA; and (c) contacting the I-CreI-derived meganuclease with the DNA; whereby the I-CreI-derived meganuclease cleaves the DNA.

In another aspect, the invention provides methods for cleaving a double-stranded DNA in vivo by (a) introducing into the DNA a recognition site for an I-CreI-derived meganuclease having a four base pair central sequence selected from the group consisting of central sequences of Table 2; and (b) contacting the I-CreI-derived meganuclease with the DNA; whereby the I-CreI-derived meganuclease cleaves the DNA.

In some embodiments, the DNA is present in a cell selected from a bacterial, fungal, plant and animal cell.

In some embodiments, the DNA is present in a nucleic acid selected from a plasmid, a prophage and a chromosome.

In some of the foregoing embodiments, the four base pair DNA sequence is selected from Table 3. In other embodiments, the four base pair DNA sequence is selected from Table 4.

In some embodiments, the I-CreI-derived meganuclease is specifically designed for use with the chosen recognition site in the methods of the invention.

In some of the foregoing embodiments, the method includes the additional step of rationally-designing the I-CreI-derived meganuclease to recognize the chosen recognition site. In some embodiments, the method further comprises producing the I-CreI-derived meganuclease.

In another aspect, the invention provides cells transformed with a nucleic acid including (a) a first 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a first domain from a single-chain I-CreI-derived meganuclease; (b) a four base pair DNA sequence selected from Table 2; and (c) a second 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a second domain from a single-chain I-CreI-derived meganuclease; wherein the second 9 base pair DNA sequence is in the reverse orientation relative to the first.

In another aspect, the invention provides a cell containing an exogenous nucleic acid sequence integrated into its genome, including, in order: (a) a first exogenous 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a first domain from a single-chain I-CreI-derived meganuclease; (b) an exogenous four base pair DNA sequence selected from Table 2; and (c) a second exogenous 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a second domain from a single-chain I-CreI-derived meganuclease; wherein the second 9 base pair DNA sequence is in the reverse orientation relative to the first.

In another aspect, the invention provides a cell containing an exogenous nucleic acid sequence integrated into its genome, including, in order: (a) a first exogenous 9 base pair DNA sequence which can be bound by an I-CreI-derived meganuclease monomer or by a first domain from a single-chain I-CreI-derived meganuclease; (b) an exogenous two base pair DNA sequence, wherein the two base pairs correspond to bases N₊₁ and N₊₂ of a four base pair DNA sequence selected from Table 2; (c) an exogenous DNA sequence comprising a coding sequence which is expressed in the cell; (d) an exogenous two base pair DNA sequence, wherein the two base pairs correspond to bases N₊₃ and N₊₄ of a four base pair DNA sequence selected from Table 2; and (e) a second exogenous 9 base pair DNA sequence which can be bound by the I-CreI-derived meganuclease monomer or by a second domain from the single-chain I-CreI-derived meganuclease; wherein the second 9 base pair DNA sequence is in the reverse orientation relative to the first.

In some embodiments, the nucleic acid is a plasmid. In other embodiments, the nucleic acid is an artificial chromosome. In other embodiments, the nucleic acid is integrated into the genomic DNA of the cell. In other embodiments, the nucleic acid is a viral nucleic acid.

In some embodiments, the cell is selected from a human cell, a non-human animal cell, a plant cell, a bacterial cell, and a fungal cell.

In some of the foregoing embodiments, the four base pair DNA sequence is selected from Table 3. In other embodiments, the four base pair DNA sequence is selected from Table 4.

In some embodiments, the I-CreI-derived meganuclease is specifically designed for use with the chosen recognition site in the methods of the invention.

2.4 Methods of Conducting a Custom-Designed, I-CreI-Derived Meganuclease Business

A meganuclease business can be conducted based on I-CreI-derived meganucleases. For example, such a business can operate as follows. The business receives a DNA sequence into which a double-strand break is to be introduced by a rationally-designed I-CreI-derived meganuclease. The business identifies in the DNA sequence at least one recognition site for a rationally-designed I-CreI-derived meganuclease with altered specificity relative to I-CreI, wherein the recognition site is not cleaved by a naturally-occurring I-CreI, and wherein the recognition site has a four base pair central sequence selected from the group consisting of TTGT, TTAT, TCTT, TCGT, TCAT, GTTT, GTCT, GGAT, GAGT, GAAT, ATGT, TTTC, TTCC, TGAC, TAAC, GTTC, ATAT, TCGA, TTAA, GGGC, ACGC, CCGC, CTGC, ACAA, ATAA, AAGA, ACGA, ATGA, AAAC, AGAC, ATCC, ACTC, ATTC, ACAT, GAAA, GGAA, GTCA, GTTA, GAAC, ATAT, TCGA, TTAA, GCCC, GCGT, GCGG and GCAG. The business then provides a rationally-designed meganuclease that cleaves the recognition site in the DNA.

Optionally, the business rationally-designs an I-CreI-derived meganuclease that cleaves the recognition site in the DNA. Optionally, the business produces the rationally-designed I-CreI-derived meganuclease.
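The group of central sequences recited above corresponds to the Table 3 center sequences followed by their opposite strand sequences. The short sketch below, with hypothetical variable names, rebuilds the group from Table 3 to make that correspondence explicit.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Table 3 center sequences, in table order.
TABLE_3_CENTERS = [
    "TTGT", "TTAT", "TCTT", "TCGT", "TCAT", "GTTT", "GTCT", "GGAT", "GAGT",
    "GAAT", "ATGT", "TTTC", "TTCC", "TGAC", "TAAC", "GTTC", "ATAT", "TCGA",
    "TTAA", "GGGC", "ACGC", "CCGC", "CTGC",
]


def reverse_complement(seq):
    """Return the opposite-strand sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Table 3 entries followed by their opposite strand sequences.
recited_group = TABLE_3_CENTERS + [reverse_complement(c) for c in TABLE_3_CENTERS]
distinct = sorted(set(recited_group))

print(len(recited_group), len(distinct))
```

The rebuilt group has 46 entries, of which 43 are distinct: ATAT, TCGA, and TTAA are their own opposite strand sequences and therefore appear twice.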

2.5 Specifically Excluded Center Sequences.

The center sequences GTAC, ACAC, and GTGA have previously been shown to be effective center sequences for in vitro and in vivo applications. These center sequences are specifically excluded from some aspects of the present invention. In addition, the center sequences TTGA and GAAA have previously been shown to be poor center sequences for in vivo applications (Arnould, et al. (2007). J. Mol. Biol. 371: 49-65).

EXAMPLES

This invention is further illustrated by the following examples, which should not be construed as limiting. Those skilled in the art will recognize, or be able to ascertain, using no more than routine experimentation, numerous equivalents to the specific substances and procedures described herein. Such equivalents are intended to be encompassed in the scope of the claims that follow the examples below. Examples 1 and 2 refer to engineered meganucleases cleaving optimized meganuclease recognition sites in vivo in a model plant system. Example 3 refers to an engineered meganuclease cleaving optimized meganuclease recognition sites in vitro.

Example 1 Cleavage of an Optimized Meganuclease Recognition Site by a Rationally-Designed, I-CreI-Derived Meganuclease Homodimer In Vivo

An engineered meganuclease called BRP2 (SEQ ID NO: 8) was produced using the method disclosed in WO 2007/047859. This meganuclease is derived from I-CreI and was engineered to recognize DNA sites that are not recognized by wild-type I-CreI (e.g., BRP2 recognition sequences include SEQ ID NO: 9 and SEQ ID NO: 10, or SEQ ID NO: 11 and SEQ ID NO: 12). To facilitate nuclear localization of the engineered meganuclease, an SV40 nuclear localization signal (NLS, SEQ ID NO: 13) was added to the N-terminus of the protein. Conventional Agrobacterium-mediated transformation procedures were used to transform Arabidopsis thaliana with a T-DNA containing a codon-optimized BRP2 coding sequence (SEQ ID NO: 14). Expression of BRP2 meganuclease was under the control of a Hsp70 promoter and a NOS terminator. A pair of BRP2 recognition sequences were housed on the same T-DNA separated by 7 base pairs containing a PstI restriction enzyme site (FIG. 3a). BRP2 cutting of the pair of BRP2 recognition sequences in this construct was expected to excise the region between the recognition sequences and thereby remove the PstI restriction site (FIG. 3b). Two such T-DNA constructs were produced which varied the center sequence of the meganuclease recognition sequences flanking the PstI restriction enzyme site (FIG. 3c). In the first construct (the “GTAC construct”), the meganuclease recognition sites had the center sequence GTAC (a preferred in vivo center sequence, Table 5, sequence 9; SEQ ID NO: 9 and SEQ ID NO: 10). The second construct (the “TAGA construct”) had the center sequence TAGA (a non-preferred center sequence, opposite strand sequence to Table 2, sequence 77; SEQ ID NO: 11 and SEQ ID NO: 12).

Stably transformed Arabidopsis plants carrying each construct were produced by selection for a kanamycin resistance marker housed on the T-DNA. Genomic DNA was then isolated from the transformed plants (by leaf punch) before and after heat-shock to induce BRP2 meganuclease expression. Genomic DNA samples were added to PCR reactions using primers to amplify the region of the T-DNA housing the meganuclease recognition sequences. PCR products were then digested with PstI and visualized by gel electrophoresis (FIG. 3d). Results are summarized in Table 6. Any PCR sample in which a significant percentage (>25%) of product was found to be resistant to PstI was considered to be indicative of in vivo meganuclease cleavage in that particular plant and was scored as “cut” in Table 6. It was found that, prior to heat-shock, the vast majority of PCR samples from plants carrying either construct retained the PstI site. After heat-shock, however, a large percentage of samples taken from plants transformed with the GTAC construct, but not the TAGA construct, had lost the PstI site. PCR products from the GTAC construct-transformed plants lacking a PstI site were cloned into a pUC-19 plasmid and sequenced. 100% of sequenced clones had a precise deletion of the region between the two BRP2 cut sites (as diagrammed in FIG. 3b). These results indicate that an engineered meganuclease is able to cleave a meganuclease recognition site in vivo provided it has an optimized center sequence.

TABLE 6 In vivo cleavage of optimized meganuclease recognition sequences by an engineered meganuclease homodimer.

             Before heat-shock     After heat-shock
Construct    Cut      Uncut        Cut      Uncut
GTAC         0        4            3        1
TAGA         0        22           0        22
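The scoring rule used for Table 6 (a plant is scored as “cut” when more than 25% of its PCR product resists PstI digestion) and the resulting per-construct fractions can be written out as a short sketch. The counts are those reported in Table 6; the function names and data layout are hypothetical.

```python
# (cut, uncut) plant counts from Table 6.
TABLE_6 = {
    "GTAC": {"before": (0, 4), "after": (3, 1)},
    "TAGA": {"before": (0, 22), "after": (0, 22)},
}


def scored_as_cut(pst1_resistant_fraction, threshold=0.25):
    """Apply the stated rule: more than 25% PstI-resistant product is 'cut'."""
    return pst1_resistant_fraction > threshold


def fraction_cut(cut, uncut):
    """Fraction of sampled plants scored as cut."""
    return cut / (cut + uncut)


for construct, counts in TABLE_6.items():
    before = fraction_cut(*counts["before"])
    after = fraction_cut(*counts["after"])
    print(construct, before, after)
```

With these counts, 3 of 4 plants (75%) carrying the GTAC construct were scored as cut after heat-shock, compared with 0 of 22 plants carrying the TAGA construct.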

Example 2 Cleavage of an Optimized Meganuclease Recognition Site by a Rationally-Designed, I-CreI-Derived Meganuclease Single-Chain Heterodimer In Vivo

The engineered meganuclease BRP12-SC (SEQ ID NO: 15) was produced in accordance with WO 2007/047859, except that this meganuclease is a single-chain heterodimer. As discussed in WO 2007/047859, wild-type I-CreI binds to and cleaves DNA as a homodimer. As a consequence, the natural recognition sequence for I-CreI is pseudo-palindromic. The BRP12-SC recognition sequences, however, are non-palindromic (e.g., SEQ ID NO: 16 and SEQ ID NO: 17, or SEQ ID NO: 18 and SEQ ID NO: 19). This necessitates the use of an engineered meganuclease heterodimer comprising a pair of subunits each of which recognizes one half-site within the full-length recognition sequence. In the case of BRP12-SC, the two engineered meganuclease monomers are physically linked to one another using an amino acid linker to produce a single-chain heterodimer. This linker comprises amino acids 166-204 (SEQ ID NO: 20) of BRP12-SC. The linker sequence joins an N-terminal meganuclease subunit terminated at L165 (corresponding to L155 of wild-type I-CreI) with a C-terminal meganuclease subunit starting at K204 (corresponding to K7 of wild-type I-CreI). The benefits of physically linking the two meganuclease monomers using this novel linker are twofold. First, it ensures that the meganuclease monomers can only associate with one another (heterodimerize) to cut the non-palindromic BRP12-SC recognition sequence rather than also forming homodimers which can recognize palindromic or pseudo-palindromic DNA sites that differ from the BRP12-SC recognition sequence. Second, the physical linking of meganuclease monomers obviates the need to express two monomers simultaneously in the same cell to obtain the desired heterodimer. This significantly simplifies vector construction in that it only requires a single gene expression cassette. As was the case with the BRP2 meganuclease discussed in Example 1, the BRP12-SC meganuclease has an SV40 nuclear localization signal (SEQ ID NO: 13) at its N-terminus.

Conventional Agrobacterium-mediated transformation procedures were used to transform Arabidopsis thaliana with a T-DNA containing a codon-optimized BRP12-SC coding sequence (SEQ ID NO: 21). Expression of BRP12-SC meganuclease was under the control of a Hsp70 promoter and a NOS terminator. A pair of BRP12-SC recognition sequences was housed on the same T-DNA, separated by 7 base pairs containing a PstI restriction enzyme site (FIG. 4a). BRP12-SC cutting of the pair of BRP12-SC recognition sequences in this construct was expected to excise the region between the recognition sequences and thereby remove the PstI restriction site (FIG. 4b). Two such T-DNA constructs were produced which varied only in the center sequences of the meganuclease recognition sequences flanking the PstI restriction enzyme site (FIG. 4c). In the first construct (the “GTAC construct”), the meganuclease recognition sites had the center sequence GTAC (a preferred in vivo center sequence, Table 5, sequence 9; SEQ ID NO: 16 and SEQ ID NO: 17). The second construct (the “TAGA construct”) had the center sequence TAGA (a non-preferred center sequence, opposite strand sequence to Table 2, sequence 77; SEQ ID NO: 18 and SEQ ID NO: 19).
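The relationship between the paired sequence identifiers and their center sequences can be checked with a short script. The following is a minimal sketch only: the helper names `reverse_complement` and `center_sequence` are hypothetical, and the sketch assumes that the four base pair center sequence occupies positions 10-13 of the 22 base pair sites listed as SEQ ID NOs: 16-19.

```python
# Illustrative sketch; helper names are hypothetical and not part of the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite-strand sequence, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def center_sequence(site: str) -> str:
    """Return the four central bases of a 22-bp site (assumed positions 10-13)."""
    if len(site) != 22:
        raise ValueError("expected a 22-bp recognition sequence")
    return site[9:13]


# Sequences copied from the Sequence Listing (spaces removed).
SEQ_16 = "TGCCTCCTCGTACGACCCGGAG"
SEQ_17 = "CTCCGGGTCGTACGAGGAGGCA"
SEQ_18 = "TGCCTCCTCTAGAGACCCGGAG"
SEQ_19 = "CTCCGGGTCTCTAGAGGAGGCA"

for top, bottom in ((SEQ_16, SEQ_17), (SEQ_18, SEQ_19)):
    # Print both centers and whether the pair are opposite strands of one site.
    print(center_sequence(top), center_sequence(bottom),
          reverse_complement(top) == bottom)
```

Run as written, the script reports GTAC on both strands of the first pair and TAGA/TCTA for the second pair, and confirms that each pair consists of the two strands of a single site. GTAC reads the same on both strands, whereas TAGA reads as TCTA on the opposite strand.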

Stably transformed Arabidopsis plants carrying each construct were produced by selection for a kanamycin resistance marker housed on the T-DNA. Genomic DNA was then isolated from the transformed plants (by leaf punch) before and after heat-shock to induce BRP12-SC meganuclease expression. Genomic DNA samples were added to PCR reactions using primers to amplify the region of the T-DNA housing the meganuclease recognition sequences. PCR products were then digested with PstI and visualized by gel electrophoresis. The results of this analysis are presented in Table 7. Any PCR sample in which a significant percentage (>25%) of product was found to be resistant to PstI was considered to be indicative of in vivo meganuclease cleavage and was scored as “cut” in Table 7. It was found that, prior to heat-shock, the vast majority of PCR samples from plants carrying either construct retained the PstI site. After heat-shock, however, a large percentage of samples taken from plants transformed with the GTAC construct (8 of 23), but not the TAGA construct (1 of 59), had lost the PstI site. PCR products from the GTAC construct-transformed plants lacking a PstI site were cloned into a pUC-19 plasmid and sequenced. 100% of sequenced clones had a precise deletion of the region between the two BRP12-SC cut sites (as diagrammed in FIG. 4b). These results indicate that an engineered single-chain meganuclease is able to cleave a meganuclease recognition site in vivo provided it has an optimized center sequence.

TABLE 7
In vivo cleavage of optimized meganuclease recognition sequences by an engineered meganuclease homodimer.

             Before heat-shock     After heat-shock
Construct    Cut      Uncut        Cut      Uncut
GTAC         0        23           8        15
TAGA         0        59           1        58
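The scoring rule and the counts in Table 7 can be restated as simple arithmetic. The following is a minimal sketch only: the names `is_cut`, `percent_cut`, and `TABLE_7` are hypothetical, the 25% threshold is the one stated in the text, and the counts are copied from Table 7.

```python
# Illustrative sketch; names are hypothetical, counts are copied from Table 7.
CUT_THRESHOLD = 0.25  # a sample is "cut" when >25% of PCR product resists PstI


def is_cut(resistant_fraction: float) -> bool:
    """Apply the scoring rule stated in the text to one PCR sample."""
    return resistant_fraction > CUT_THRESHOLD


def percent_cut(cut: int, uncut: int) -> float:
    """Percentage of samples scored as cut."""
    return 100.0 * cut / (cut + uncut)


# (cut, uncut) sample counts per construct and condition.
TABLE_7 = {
    "GTAC": {"before": (0, 23), "after": (8, 15)},
    "TAGA": {"before": (0, 59), "after": (1, 58)},
}

for construct, conditions in TABLE_7.items():
    for condition, (cut, uncut) in conditions.items():
        print(f"{construct} {condition} heat-shock: "
              f"{cut}/{cut + uncut} cut ({percent_cut(cut, uncut):.1f}%)")
```

On these counts, about 34.8% of GTAC samples and about 1.7% of TAGA samples were scored as cut after heat-shock, and none were scored as cut before heat-shock.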

Example 3 Cleavage of an Optimized Meganuclease Recognition Site by a Rationally-Designed, I-CreI-Derived Meganuclease Homodimer In Vitro

The BRP2 meganuclease described in Example 1 (SEQ ID NO: 8) was expressed in E. coli and purified as in Example 1 of WO 2007/047859. The purified meganuclease was then added at varying concentrations to reactions containing plasmids harboring BRP2 recognition sequences with either a GTAC or TAGA center sequence (0.25 picomoles of plasmid substrate in 25 microliters of SA buffer: 25 mM Tris-HCl, pH 8.0, 100 mM NaCl, 5 mM MgCl₂, 5 mM EDTA). Reactions were incubated at 37° C. for 2 hours and were then visualized by gel electrophoresis, and the percentage of each plasmid substrate cleaved by the meganuclease was plotted as a function of meganuclease concentration (FIG. 5). It was found that the plasmid substrate with the TAGA center sequence was cleaved by the meganuclease in vitro, but that cleavage of this substrate required a far higher concentration of BRP2 meganuclease than did cleavage of the GTAC substrate.
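For comparison with the meganuclease concentrations plotted in FIG. 5, the stated substrate amount can be converted to a molar concentration. The following is a minimal sketch only: the function name `molar_concentration_nM` is hypothetical, and the only inputs are the 0.25 picomoles and 25 microliters given above.

```python
# Illustrative sketch; the function name is hypothetical.
def molar_concentration_nM(picomoles: float, microliters: float) -> float:
    """Convert an amount in pmol and a volume in uL to nanomolar.

    1 pmol/uL equals 1 uM, i.e. 1000 nM.
    """
    return picomoles / microliters * 1000.0


# 0.25 pmol of plasmid substrate in a 25 uL reaction.
substrate_nM = molar_concentration_nM(picomoles=0.25, microliters=25.0)
print(f"plasmid substrate: {substrate_nM:.0f} nM")
```

The plasmid substrate in each reaction is therefore present at 10 nM.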

SEQUENCE LISTING

(wild-type I-CreI, Genbank Accession # P05725) SEQ ID NO: 1
   1 MNTKYNKEFL LYLAGFVDGD GSIIAQIKPN QSYKFKHQLS LAFQVTQKTQ RRWFLDKLVD
  61 EIGVGYVRDR GSVSDYILSE IKPLHNFLTQ LQPFLKLKQK QANLVLKIIW RLPSAKESPD
 121 KFLEVCTWVD QIAALNDSKT RKTTSETVRA VLDSLSEKKK SSP

(wild-type I-CreI recognition sequence) SEQ ID NO: 2
   1 GAAACTGTCT CACGACGTTT TG

(wild-type I-CreI recognition sequence) SEQ ID NO: 3
   1 CAAAACGTCG TGAGACAGTT TC

(DJ1 amino acid sequence) SEQ ID NO: 4
   1 MNTKYNKEFL LYLAGFVDGD GSIIAQIDPR QNYKFKHELR LRFQVTQKTQ RRWFLDKLVD
  61 EIGVGYVRDR GSVSDYILSE IKPLHNFLTQ LQPFLKLKQK QANLVLKIIE QLPSAKESPD
 121 KFLEVCTWVD QIAALNDSKT RKTTSETVRA VLDSLSEKKK SSP

(DJI recognition sequence-GTGA center sequence) SEQ ID NO: 5
   1 AACGGTGTCG TGAGACACCG TT

(DJI recognition sequence-GTGA center sequence) SEQ ID NO: 6
   1 AACGGTGTCT CACGACACCG TT

(DJI half-site) SEQ ID NO: 7
   1 AACGGTGTC

(BRP2 amino acid sequence) SEQ ID NO: 8
   1 MGPKKKRKVI MNTKYNKEFL LYLAGFVDGD GSIIASIRPR QSCKFKHELE
  51 LRFQVTQKTQ RRWFLDKLVD EIGVGYVRDR GSVSDYRLSQ IKPLHNFLTQ
 101 LQPFLKLKQK QANLVLKIIE QLPSAKESPD KFLEVCTWVD QIAALNDSKT
 151 RKTTSETVRA VLDSLSEKKK SSP

(BRP2 recognition sequence-GTAC center sequence) SEQ ID NO: 9
   1 CTCCGGGTCG TACGACCCGG AG

(BRP2 recognition sequence-GTAC center sequence) SEQ ID NO: 10
   1 CTCCGGGTCG TACGACCCGG AG

(BRP2 recognition sequence-TAGA center sequence) SEQ ID NO: 11
   1 CTCCGGGTCT AGAGACCCGG AG

(BRP2 recognition sequence-TAGA center sequence) SEQ ID NO: 12
   1 CTCCGGGTCT CTAGACCCGG AG

(SV40 nuclear localization signal amino acid sequence) SEQ ID NO: 13
   1 MAPKKKRKV

(BRP2 codon-optimized DNA sequence) SEQ ID NO: 14
   1 ATGGGCCCGA AGAAGAAGCG CAAGGTCATC ATGAACACCA AGTACAACAA
  51 GGAGTTCCTG CTCTACCTGG CGGGCTTCGT GGACGGGGAC GGCTCCATCA
 101 TCGCCTCCAT CCGCCCGCGT CAGTCCTGCA AGTTCAAGCA TGAGCTGGAA
 151 CTCCGGTTCC AGGTCACGCA GAAGACACAG CGCCGTTGGT TCCTCGACAA
 201 GCTGGTGGAC GAGATCGGGG TGGGCTACGT GCGCGACCGC GGCAGCGTCT
 251 CCGACTACCG CCTGAGCCAG ATCAAGCCTC TGCACAACTT CCTGACCCAG
 301 CTCCAGCCCT TCCTGAAGCT CAAGCAGAAG CAGGCCAACC TCGTGCTGAA
 351 GATCATCGAG CAGCTGCCCT CCGCCAAGGA ATCCCCGGAC AAGTTCCTGG
 401 AGGTGTGCAC CTGGGTGGAC CAGATCGCCG CTCTGAACGA CTCCAAGACC
 451 CGCAAGACCA CTTCCGAGAC CGTCCGCGCC GTGCTGGACA GTCTCTCCGA
 501 GAAGAAGAAG TCGTCCCCCT AG

(BRP12-SC amino acid sequence) SEQ ID NO: 15
   1 MGPKKKRKVI MNTKYNKEFL LYLAGFVDGD GSIKAQIRPR QSRKFKHELE
  51 LTFQVTQKTQ RRWFLDKLVD EIGVGKVYDR GSVSDYELSQ IKPLHNFLTQ
 101 LQPFLKLKQK QANLVLKIIE QLPSAKESPD KFLEVCTWVD QIAALNDSKT
 151 RKTTSETVRA VLDSLPGSVG GLSPSQASSA ASSASSSPGS GISEALRAGA
 201 TKSKEFLLYL AGFVDGDGSI IASIRPRQSC KFKHELELRF QVTQKTQRRW
 251 FLDKLVDEIG VGYVRDRGSV SDYRLSQIKP LHNFLTQLQP FLKLKQKQAN
 301 LVLKIIEQLP SAKESPDKFL EVCTWVDQIA ALNDSKTRKT TSETVRAVLD
 351 SLSEKKKSSP

(BRP12-SC recognition sequence-GTAC center sequence) SEQ ID NO: 16
   1 TGCCTCCTCG TACGACCCGG AG

(BRP12-SC recognition sequence-GTAC center sequence) SEQ ID NO: 17
   1 CTCCGGGTCG TACGAGGAGG CA

(BRP12-SC recognition sequence-TAGA center sequence) SEQ ID NO: 18
   1 TGCCTCCTCT AGAGACCCGG AG

(BRP12-SC recognition sequence-TAGA center sequence) SEQ ID NO: 19
   1 CTCCGGGTCT CTAGAGGAGG CA

(BRP12-SC linker amino acid sequence) SEQ ID NO: 20
   1 PGSVGGLSPS QASSAASSAS SSPGSGISEA LRAGATKS

(BRP12-SC codon-optimized DNA sequence) SEQ ID NO: 21
   1 ATGGGCCCGA AGAAGAAGCG CAAGGTCATC ATGAACACCA AGTACAACAA
  51 GGAGTTCCTG CTCTACCTGG CCGGCTTCGT GGACGGCGAC GGCTCCATCA
 101 AGGCGCAGAT CCGTCCGCGG CAGAGCCGGA AGTTCAAGCA CGAGCTCGAG
 151 CTGACCTTCC AGGTGACCCA GAAGACGCAG AGGCGCTGGT TCCTCGACAA
 201 GCTGGTGGAC GAGATCGGGG TGGGCAAGGT CTACGACCGC GGGTCGGTGT
 251 CCGACTACGA GCTCTCCCAG ATCAAGCCCC TGCACAACTT CCTCACCCAG
 301 CTCCAGCCGT TCCTGAAGCT CAAGCAGAAG CAGGCCAACC TCGTGCTGAA
 351 GATCATCGAG CAGCTGCCCT CCGCCAAGGA ATCCCCGGAC AAGTTCCTGG
 401 AGGTGTGCAC GTGGGTGGAC CAGATCGCGG CCCTCAACGA CAGCAAGACC
 451 CGCAAGACGA CCTCGGAGAC GGTGCGGGCG GTCCTGGACT CCCTCCCAGG
 501 ATCCGTGGGA GGTCTATCGC CATCTCAGGC ATCCAGCGCC GCATCCTCGG
 551 CTTCCTCAAG CCCGGGTTCA GGGATCTCCG AAGCACTCAG AGCTGGAGCA
 601 ACTAAGTCCA AGGAATTCCT GCTCTACCTG GCGGGCTTCG TGGACGGGGA
 651 CGGCTCCATC ATCGCCTCCA TCCGCCCGCG TCAGTCCTGC AAGTTCAAGC
 701 ATGAGCTGGA ACTCCGGTTC CAGGTCACGC AGAAGACACA GCGCCGTTGG
 751 TTCCTCGACA AGCTGGTGGA CGAGATCGGG GTGGGCTACG TGCGCGACCG
 801 CGGCAGCGTC TCCGACTACC GCCTGAGCCA GATCAAGCCT CTGCACAACT
 851 TCCTGACCCA GCTCCAGCCC TTCCTGAAGC TCAAGCAGAA GCAGGCCAAC
 901 CTCGTGCTGA AGATCATCGA GCAGCTGCCC TCCGCCAAGG AATCCCCGGA
 951 CAAGTTCCTG GAGGTGTGCA CCTGGGTGGA CCAGATCGCC GCTCTGAACG
1001 ACTCCAAGAC CCGCAAGACC ACTTCCGAGA CCGTCCGCGC CGTGCTGGAC
1051 AGTCTCTCCG AGAAGAAGAA GTCGTCCCCC TAG

(recognition sequence) SEQ ID NO: 22
   1 aacggtgtcn nnngacaccg tt

(recognition sequence) SEQ ID NO: 23
   1 aacggtgtcn nnngacaccg tt

(“LAGLIDADG” family motif peptide) SEQ ID NO: 24
   1 LAGLIDADG

1. A method for cleaving a double-stranded DNA comprising: (a) identifying in said DNA at least one recognition site for a rationally-designed I-CreI-derived meganuclease with altered specificity relative to I-CreI, wherein said recognition site is not cleaved by a naturally-occurring I-CreI, wherein said recognition site has a four base pair central sequence selected from the group consisting of ACAC, ACAT, and ATAT; (b) providing said rationally-designed meganuclease; and (c) contacting said DNA with said rationally-designed meganuclease; whereby said rationally-designed meganuclease cleaves said DNA.
2. The method of claim 1, wherein said DNA cleavage is in vitro.
3. The method of claim 1, wherein said DNA is selected from the group consisting of a PCR product; an artificial chromosome; genomic DNA isolated from bacteria, fungi, plants, or animal cells; and viral DNA.
4. The method of claim 1, wherein said DNA cleavage is in vivo.
5. The method of claim 4, wherein said DNA is present in a cell selected from the group consisting of a bacterial, fungal, plant and animal cell.
6. The method of claim 4, wherein said DNA is present in a nucleic acid selected from the group consisting of a plasmid, a prophage and a chromosome.
7. The method of claim 1, further comprising rationally-designing said I-CreI-derived meganuclease to recognize said recognition site.
8. The method of claim 1, further comprising producing said rationally-designed I-CreI-derived meganuclease.
9. The method of claim 1, wherein said recognition site has a four base pair central sequence consisting of ACAC.
10. The method of claim 1, wherein said recognition site has a four base pair central sequence consisting of ACAT.
11. The method of claim 1, wherein said recognition site has a four base pair central sequence consisting of ATAT.